Improved speech reading through a free-parts representation
Abstract
Motivated by the success of free-parts based representations in face recognition [1], we have attempted to address some of the problems associated with applying such a philosophy to the task of speaker-independent automatic speech reading. A major problem with canonical area-based approaches in automatic speech reading has hitherto been the intrinsic lack of training observations, due to the visual speech modality’s low sample rate and large variability in appearance. We believe a free-parts representation can overcome many of these limitations because of its natural ability to generalize by producing many observations from a single mouth image, whilst still preserving the ability to discriminate between various visual-speech units. This approach additionally requires a modification to the traditional techniques employed for the estimation of hidden Markov models (HMMs); we currently refer to the resultant models as free-parts HMMs (FP-HMMs). Results will be presented on the CUAVE audiovisual speech database.
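The idea of producing many observations from a single mouth image can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extract_patches`, the 8-pixel patch size, the 4-pixel stride and the use of raw pixel values as features are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def extract_patches(image, patch=8, stride=4):
    """Cut a 2-D mouth image into overlapping square patches.

    Each patch is flattened into one row, so a single image yields
    many observation vectors. Patch size and stride are hypothetical
    values, not taken from the paper.
    """
    height, width = image.shape
    rows = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            block = image[top:top + patch, left:left + patch]
            rows.append(block.ravel())
    return np.stack(rows)

# A synthetic 32 x 48 "mouth image" stands in for real data.
mouth = np.random.default_rng(0).random((32, 48))
observations = extract_patches(mouth)
print(observations.shape)  # (77, 64): 7 x 11 patches of 8 x 8 pixels
```

Under these assumptions one frame contributes 77 observations instead of one, which is the sense in which a parts-based representation eases the shortage of training data. Note that the rows carry no record of where in the image each patch came from.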
Similar resources
Fluent Aphasia From Herpes Simplex Encephalitis
The present case report introduces a patient with fluent aphasia, anterograde amnesia and anosmia due to herpes simplex encephalitis after her first delivery. The left medial temporal lobe was one of the main areas involved. On aphasia testing she showed severe anomia on both confrontation and free recall, agraphia, alexia, repetition disorder and some auditory comprehension impairments. Therap...
A study of speech quality indices in normal Persian-speaking children aged 4-5 years in the cities of Semnan, Birjand and Tonekabon, 1383 SH (2004-2005)
Background and purpose: We can examine the language abilities of a person through five parameters of speech quality: speech fluency, speech complexity, speech exactness, speech rate and lexical accessibility. These parameters are examined through secondary parameters including mean length of utterance (MLU), mean length of five long utterances, mean number of verbs per sentence, mean nu...
Designing and implementing a system for automatic recognition of Persian letters by lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned their attention to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges for speech recognition; one of the challenges b...
Integrating monolithic and free-parts representations for improved face verification in the presence of pose mismatch
Face images, varying under pose, are dramatically different in their “pixel” appearance even if they stem from the same subject. Our work concentrates specifically on the task of verifying faces when the gallery set stems from frontal face images, with the probe set stemming from a number of alternate poses (i.e. pose mismatch). An argument is put forward for attempting to recognize faces throu...
Publication date: 2005